Multi-View Dimensionality Reduction via Canonical Correlation Analysis
Authors
Abstract
We analyze the multi-view regression problem where we have two views X = (X1, X2) of the input data and a target variable Y of interest. We provide sufficient conditions under which we can reduce the dimensionality of X (via a projection) without losing predictive power for Y. Crucially, this projection can be computed via Canonical Correlation Analysis on the unlabeled data alone. The algorithmic template is as follows: with the unlabeled data, perform CCA and construct a certain projection; with the labeled data, do least squares regression in this lower dimensional space. We show how, under certain natural assumptions, the number of labeled samples can be significantly reduced (in comparison to the single view setting); in particular, we show that this dimensionality reduction does not lose predictive power for Y (thus it introduces only a little bias but can drastically reduce the variance). We explore two separate assumptions under which this is possible and show how, under either assumption alone, dimensionality reduction can reduce the labeled sample complexity. The two assumptions we consider are a conditional independence assumption and a redundancy assumption. The typical conditional independence assumption is that, conditioned on Y, the views X1 and X2 are independent; we relax this to: conditioned on some hidden state H, the views X1 and X2 are independent. The redundancy assumption is that the best predictor from each view is roughly as good as the best predictor using both views.
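The following is a minimal sketch of this two-stage template, not the paper's implementation: it assumes synthetic two-view data generated from a shared hidden state H, scikit-learn's CCA and LinearRegression, and arbitrary dimensions and sample sizes chosen purely for illustration.

```python
# Sketch: CCA on unlabeled two-view data, then least squares on the projected labeled data.
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
d1, d2, k = 50, 40, 5                      # view dimensions and hidden-state dimension (assumed)
A1 = rng.normal(size=(k, d1))              # fixed mixing matrices for the synthetic views
A2 = rng.normal(size=(k, d2))
w = rng.normal(size=k)                     # target weights on the hidden state

def make_views(n):
    """Two views X1, X2 and a target Y, all driven by a shared hidden state H."""
    H = rng.normal(size=(n, k))
    X1 = H @ A1 + 0.1 * rng.normal(size=(n, d1))
    X2 = H @ A2 + 0.1 * rng.normal(size=(n, d2))
    Y = H @ w + 0.1 * rng.normal(size=n)
    return X1, X2, Y

# Step 1 (unlabeled data): CCA between the two views yields a k-dimensional
# projection of view 1; no labels are needed here.
X1_u, X2_u, _ = make_views(5000)
cca = CCA(n_components=k).fit(X1_u, X2_u)

# Step 2 (labeled data): ordinary least squares in the projected space,
# using far fewer labeled examples than the ambient dimension would suggest.
X1_l, _, Y_l = make_views(100)
reg = LinearRegression().fit(cca.transform(X1_l), Y_l)

# Held-out evaluation using the projected first view only.
X1_t, _, Y_t = make_views(1000)
print("held-out R^2:", reg.score(cca.transform(X1_t), Y_t))
```

The point of the sketch is the division of labor: the projection is estimated from plentiful unlabeled view pairs, so the labeled sample is spent only on a k-dimensional regression.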
Similar resources
Kernel CCA for multi-view learning of acoustic features using articulatory measurements
We consider the problem of learning transformations of acoustic feature vectors for phonetic frame classification, in a multi-view setting where articulatory measurements are available at training time but not at test time. Canonical correlation analysis (CCA) has previously been used to learn linear transformations of the acoustic features that are maximally correlated with articulatory measur...
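Below is a minimal regularized kernel CCA sketch in the spirit of this approach, not the paper's implementation; the RBF kernel, the regularization constant, and all names and feature sizes here are illustrative assumptions.

```python
# Sketch: regularized kernel CCA via a symmetric generalized eigenproblem.
import numpy as np
from scipy.linalg import eigh
from sklearn.metrics.pairwise import rbf_kernel

def center_kernel(K):
    """Double-center a kernel matrix (centering in feature space)."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    return J @ K @ J

def kcca(X1, X2, n_components=10, reg=1e-2):
    """Return dual coefficients (alpha, beta) and the centered training kernels."""
    K1 = center_kernel(rbf_kernel(X1))
    K2 = center_kernel(rbf_kernel(X2))
    n = K1.shape[0]
    Z = np.zeros((n, n))
    A = np.block([[Z, K1 @ K2], [K2 @ K1, Z]])            # cross-view coupling term
    B = np.block([[K1 @ K1 + reg * np.eye(n), Z],
                  [Z, K2 @ K2 + reg * np.eye(n)]])        # regularized within-view norms
    vals, vecs = eigh(A, B)                               # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:n_components]           # keep the largest correlations
    return vecs[:n, top], vecs[n:, top], K1, K2

# Train with both views (acoustic X1, articulatory X2); at test time only the
# acoustic view is needed, projected via its kernel against the training acoustics.
X1_train = np.random.default_rng(1).normal(size=(200, 39))   # MFCC-like acoustic features (assumed)
X2_train = np.random.default_rng(2).normal(size=(200, 16))   # articulatory measurements (assumed)
alpha, beta, K1, K2 = kcca(X1_train, X2_train, n_components=10)
acoustic_embedding = K1 @ alpha                              # learned acoustic representation
```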
Semi-Supervised Dimensionality Reduction via Canonical Correlation Analysis
We analyze the multi-view regression problem where we have two views (X1, X2) of the input data and a real target variable Y of interest. In a semi-supervised learning setting, we consider two separate assumptions (one based on redundancy and the other based on (de)correlation) and show how, under either assumption alone, dimensionality reduction (based on CCA) could reduce the labeled sample co...
A unified dimensionality reduction framework for semi-paired and semi-supervised multi-view data
Xiaohong Chen, Songcan Chen, Hui Xue, Xudong Zhou. 1 Department of Mathematics, Nanjing University of Aeronautics & Astronautics, Nanjing 210016, China; 2 Department of Computer Science and Engineering, Nanjing University of Aeronautics & Astronautics, Nanjing 210016, China; 3 State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093...
Multi-Label Prediction via Sparse Infinite CCA
Canonical Correlation Analysis (CCA) is a useful technique for modeling dependencies between two (or more) sets of variables. Building upon the recently suggested probabilistic interpretation of CCA, we propose a nonparametric, fully Bayesian framework that can automatically select the number of correlation components, and effectively capture the sparsity underlying the projections. In addition...
An Information Theoretic Framework for Multi-view Learning
In the multi-view learning paradigm, the input variable is partitioned into two different views X1 and X2, and there is a target variable Y of interest. The underlying assumption is that either view alone is sufficient to predict the target Y accurately. This provides a natural semi-supervised learning setting in which unlabeled data can be used to eliminate hypotheses from either view whose pr...
Publication year: 2008